SOUND REPRODUCTION SYSTEM

This invention relates to sound reproduction systems, and in particular to
an improved system for binaural synthesis, that is, the generation of sound signals
such that the pressures at a user's ears correspond to those which would have
existed in the presence of the sound source to be simulated. Such sounds
have a true source, generally a loudspeaker or array of loudspeakers, but
seem to the listener to originate from another source, located at the position of the
source being simulated. This perceived source of the sound is known as a "virtual
source".

The principle of using a conventional stereo loudspeaker setup for binaural
synthesis was first conceived by Atal B S & Schroeder M R, "Apparent sound
source translator", US Patent 3236949, 1966, and was later further optimised by
Cooper D H & Bauck J L, "Prospects for transaural recording", Journal of the Audio
Engineering Society, Vol. 37 (3-19), 1989, who introduced the term "Transaural
Stereo". See also Schroeder M R, "Models of Hearing", Proc. IEEE, Vol. 63 (1332-1350),
1975, and Cooper D H & Bauck J L, "Generalised transaural stereo", 93rd AES
Convention, Preprint 3401, 1992.

If the signals at the ears relating to the direction of sound sources can be
reconstructed accurately, together with accurate reconstruction of secondary
images such as reflections, compelling spatial immersion can be accomplished.

In a loudspeaker listening situation, in order to synthesise the correct
signals at the ears to simulate a sound source at some physical point other than
the loudspeakers, the signals to the loudspeakers have to be tailored in such a
way as to reconstruct, at the listener's ears, sound pressures indistinguishable from
those that the ears would have received in a free field setup. The propagation from
each loudspeaker L1, L2 to each ear of a listener Z is represented in Figure 1, and
can be characterised by the following matrix equation:

\[
\begin{bmatrix} X_L \\ X_R \end{bmatrix}
=
\begin{bmatrix} H_{1L} & H_{2L} \\ H_{1R} & H_{2R} \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
\]

where:

X_L is the signal received at the left ear;
X_R is the signal received at the right ear;
Y_1 is the signal transmitted by the left source (loudspeaker L1);
Y_2 is the signal transmitted by the right source (loudspeaker L2);
H_{1L} = transfer function of left source (loudspeaker L1) to left ear
H_{1R} = transfer function of left source (loudspeaker L1) to right ear
H_{2L} = transfer function of right source (loudspeaker L2) to left ear
H_{2R} = transfer function of right source (loudspeaker L2) to right ear

WO 98/58522    PCT/GB98/01527

Solving for Y with known signals X that describe the sound source at an
arbitrary point in space yields the appropriate signals to be fed to the
loudspeakers. This equation shows that the signal X must be
filtered through a crosstalk cancellation stage formed by the inverted matrix
(hereinafter referred to as the crosstalk cancellation matrix), as given by the
following equation:

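\[
\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
=
\frac{1}{H_{1L} H_{2R} - H_{1R} H_{2L}}
\begin{bmatrix} H_{2R} & -H_{2L} \\ -H_{1R} & H_{1L} \end{bmatrix}
\begin{bmatrix} X_L \\ X_R \end{bmatrix}
\]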

In theory, such a derivation method for a crosstalk cancellation solution 
could be applied to any set-up of a pair of loudspeakers, whether symmetrical or 
non-symmetrical. 
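
As an illustration of the inversion just described, the following Python sketch applies the crosstalk cancellation matrix at a single frequency and checks that the loudspeaker signals reproduce the desired ear signals. It is a minimal sketch, not the implementation of the invention: the function names and all numerical transfer-function values are made up for the example, and a practical system would repeat the calculation for every frequency bin (or use equivalent time-domain filters). The values are deliberately non-symmetrical.

```python
def crosstalk_cancel(h1l, h1r, h2l, h2r, x_l, x_r):
    """Return loudspeaker signals (y1, y2) for desired ear signals (x_l, x_r).

    All arguments are complex values at one frequency (hypothetical data).
    """
    det = h1l * h2r - h1r * h2l
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is singular at this frequency")
    y1 = (h2r * x_l - h2l * x_r) / det
    y2 = (h1l * x_r - h1r * x_l) / det
    return y1, y2


def ears_from_speakers(h1l, h1r, h2l, h2r, y1, y2):
    """Forward model: ear signals produced by loudspeaker signals y1, y2."""
    x_l = h1l * y1 + h2l * y2
    x_r = h1r * y1 + h2r * y2
    return x_l, x_r


# Made-up, non-symmetrical transfer functions and target ear signals.
H = dict(h1l=1.0 + 0.0j, h1r=0.4 - 0.3j, h2l=0.35 - 0.25j, h2r=0.9 + 0.1j)
target_l, target_r = 0.8 + 0.2j, 0.1 - 0.5j

y1, y2 = crosstalk_cancel(x_l=target_l, x_r=target_r, **H)
got_l, got_r = ears_from_speakers(y1=y1, y2=y2, **H)
print(abs(got_l - target_l) < 1e-12, abs(got_r - target_r) < 1e-12)
```

The singularity check reflects the fact that the inverse exists only where the determinant H_{1L}H_{2R} - H_{1R}H_{2L} is non-zero.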

In a conventional crosstalk cancellation configuration for a stereo pair of
loudspeakers, and for all transaural systems described so far, all synthesised sound
images are filtered through a crosstalk cancellation process. However, a stereo pair
cannot give accurate reconstruction of signals from all directions. A typical stereo
pair arranged in front of a listener gives accurate simulation only within
approximately ±100° of the direction the listener is facing. Moreover, if more
than two loudspeakers are introduced, the crosstalk cancellation technique breaks
down because the resulting system of equations is indeterminate (more unknowns
than equations). Cooper and Bauck extended their generalised transaural theory to
more than two discrete channels of information, generalising the crosstalk
cancellation to any number of loudspeakers for any number of listeners. However,
only approximate solutions were given, in an attempt to solve an indeterminate
system for one listener.

According to the invention, there is provided a sound reproduction system
for reproducing sound, the system comprising a plurality of loudspeakers, a
processor capable of determining where, within a defined space, a virtual sound
source is located and, for each virtual sound source, means for selecting a sub-set
of the loudspeakers, said sub-set being selected from the plurality of loudspeakers
on the basis of the location of the virtual sound source in the defined space, and
means for applying a cross talk cancellation process to the selected sub-set of the
loudspeakers.

In another aspect of the invention, there is provided a method of sound
reproduction for reproducing sound by way of a plurality of loudspeakers, the method
comprising the steps of determining where, within a defined space, a virtual sound
source is located and, for each virtual sound source, applying a cross talk
cancellation process to a sub-set of the loudspeakers, said sub-set being selected
from the plurality of loudspeakers on the basis of the location of the virtual sound
source in the defined space.

The plurality of loudspeakers from which the subset is selected allows
accurate simulation over a greater range of virtual source locations than a single
pair of loudspeakers could achieve. However, the selection of a subset (preferably
a pair) from this larger plurality of loudspeakers allows the crosstalk processing to
be greatly simplified. The pairwise concept introduced here embraces a finite
number of independent crosstalk cancellation processes, each associated with a
pair of loudspeakers in a multiple speaker array. The derivation of the crosstalk
cancellation matrix process for each pair is identical to that for a conventional pair.
The number of independent crosstalk cancellation matrix modules which can be
implemented in such an array is governed by the locations of loudspeakers in the
multi-loudspeaker array, and the spatial coverage and accuracy achievable by an
optimised pair of loudspeakers in that array.

Embodiments of the invention will now be described with reference to the
drawings, in which:

Figure 1 shows a conventional stereo pair configuration with the
respective transfer functions from sources to ears as already discussed;

Figure 2 illustrates four physical point sources with the maximum possible
number of crosstalk cancellation processes;

Figure 3 illustrates a lateral set of four loudspeakers, showing the
loudspeakers' area of coverage on the lateral plane (the horizontal plane containing
the ears);

Figure 4 illustrates the application of binaurally synthesised signals to
appropriate crosstalk cancellation processes for the configuration of Figure 3;

Figure 5 illustrates a three loudspeaker configuration;

Figure 6 illustrates a five loudspeaker configuration;

Figure 7 illustrates an application of virtual static point sources to
overcome limitations in available space;

Figure 8 illustrates another four-loudspeaker configuration;

Figure 9 shows schematically a pairwise crosstalk cancellation
implementation circuit for localising five monophonic virtual sources using the
four-loudspeaker layout of Figure 8.

Figure 2 shows a loudspeaker layout having four loudspeakers L1, L2, L3,
L4. It is not in general necessary to implement all the possible pairwise processes,
as in most configurations only adjacent pairs of loudspeakers are used. For
some virtual sources, however, non-adjacent pairs may be selected (as will be seen when
discussing Figure 6), so the maximum number of crosstalk cancellation processes
between pairs of loudspeakers in an array of four loudspeakers is not four but six
or, more generally, for an array of n loudspeakers, n(n-1)/2.
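
The count can be checked by enumerating the pairs directly. The following Python sketch uses the standard-library `itertools.combinations`; the function name `loudspeaker_pairs` is hypothetical and chosen only for this example.

```python
from itertools import combinations


def loudspeaker_pairs(names):
    """Return every unordered pair of loudspeakers in the array."""
    return list(combinations(names, 2))


pairs = loudspeaker_pairs(["L1", "L2", "L3", "L4"])
for pair in pairs:
    print(pair)
print(len(pairs))  # 4 * 3 / 2 = 6
```

In most layouts only some of these pairs would actually be given a crosstalk cancellation module, as the text notes.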

The selection of an appropriate crosstalk cancellation process is governed
by the direction of the synthesised sound source or sources, i.e. if synthesised
sound images are to emanate from directions which are covered by one pair of
loudspeakers, the processed directional signals are only applied to that pair of
loudspeakers and its respective crosstalk cancellation process. If two or more
sound sources of different directions are to be synthesised and played back via an
array of multiple loudspeakers, respective crosstalk cancellation process modules
relating to respective pairs of loudspeakers can be implemented to deliver each pair
of directional signals to the ears, taking note that the process is always performed
pairwise.

To illustrate the pairwise concept and the explanation given above,
consider the lateral setup shown in Figure 3. The layout consists of a ±30°
frontal pair of loudspeakers L1, L2, and a ±120° rear pair of loudspeakers L3, L4
(angles of incidence are measured with respect to the direction due front of the
listener Z). Seven virtual images V1 to V7 are shown emanating from different
bearings. To deliver each binaurally-synthesised sound signal (carrying
directional information) correctly to the listener's ears, each pair of signals is applied to
the crosstalk cancellation process module of the pair of loudspeakers which
covers the location of the sound image. Four areas of coverage are shown, with
loudspeakers L1, L2 encompassing the frontal sector 31 (±60°), L1, L3 and L2, L4
covering the left and right sectors 32, 33 respectively, and L3, L4 providing rear coverage (sector
34). The block diagram in Figure 4 illustrates the switching of a number
of processed signals having left and right components (X_L, X_R), as heard at the ears,
to the appropriate modules 41, 42, 43, 44, each corresponding to the pair of
loudspeakers appropriate to the lateral bearings of these signals.

Translating virtual moving sound sources using the pairwise concept can
be achieved by correctly switching or directing the synthesised signals to the
appropriate pairwise crosstalk cancellation process. Using the example shown in
Figure 3, a sound source can be made to translate from the left sector (32) to the
frontal sector (31), by first applying the synthesised signal to the crosstalk
cancellation processor 42 for the left sector 32, to give its initial position as well
as the points of movement within the left sector, depending on the angular step
size between synthesised sources. Once the image shifts to the next sector, the
synthesised signals are switched to the crosstalk cancellation processor 41 for the
front sector 31 to continue projecting the moving source.
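
The sector routing of Figures 3 and 4, and the switching of a moving source between sectors, might be sketched as follows. This is a minimal illustration rather than the circuit of Figure 4: the function names, the sign convention (negative azimuth to the listener's left, 0° due front) and the treatment of bearings lying exactly on a sector boundary are assumptions made for the example.

```python
def select_pair(azimuth_deg):
    """Return the loudspeaker pair covering a bearing in the Figure 3 layout.

    Assumed convention: 0 = due front, negative = left, positive = right.
    """
    a = (azimuth_deg + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
    if abs(a) <= 60.0:
        return ("L1", "L2")  # frontal sector 31
    if abs(a) > 120.0:
        return ("L3", "L4")  # rear sector 34
    # left sector 32 or right sector 33
    return ("L1", "L3") if a < 0 else ("L2", "L4")


def switch_points(trajectory_deg):
    """List (azimuth, pair) each time a moving source changes pair."""
    changes = []
    current = None
    for az in trajectory_deg:
        pair = select_pair(az)
        if pair != current:
            changes.append((az, pair))
            current = pair
    return changes


# A source moving from the left sector into the frontal sector.
trajectory = [-110, -90, -70, -50, -30, 0]
for az, pair in switch_points(trajectory):
    print(az, pair)
```

With the 20° step used here the hand-over from the left pair to the frontal pair is reported at the first bearing inside the frontal sector; a finer angular step would place it closer to the sector boundary.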

The example shown above may appear to suggest that the pairwise
concept restricts the crosstalk cancellation to within the angle between the pair of
loudspeakers. However, the angle of coverage, be it lateral or spherical, strictly
depends on how well a pair of loudspeakers can spatialise within its capability (in
the sense of localisation accuracy). The following worked examples were taken
from experiments which demonstrate that different paired configurations gave
significantly different localisation abilities and reveal advantages of some
unconventional loudspeaker placements over current layout practice.

An unusual layout, which may seem impractical on initial inspection,
is shown in Figure 5. This has just three loudspeakers L1, L2, L3 (Left, Centre
Front, and Right), arranged at 0° (Centre Front) and ±90° (Right and Left). It
displays good imaging ability within the respective loudspeakers' optimised fields
of coverage as shown in Figure 5. The left and right frontal quadrants 51, 52
covered by the Left/Centre pair L1/L2 and Right/Centre pair L2/L3 give good static
frontal sources even with a distinct degree of head rotation to face the virtually
positioned source. The unconventional Left/Right pair L1/L3 along the axis of the
ears gave remarkable rear incidence synthesised images covering the range from
+90° to -90°, even on the onsets of the synthesised sound sample. The
Left-Right ear axis loudspeaker pair L1/L3 not only gives coverage along the rear half of
the lateral plane (sector 53), but also encompasses the rear hemisphere, i.e.
including point sources above or below the lateral plane.

Another example is illustrated in Figure 6. This illustrates that the
coverage provided by some paired loudspeakers is limited but, by combining them with
several other pairs of loudspeakers in the array, the voids are filled and the desired
spatialisation is achieved. Five loudspeakers are used, arranged at 0° (Centre-Front:
L2), ±60° (Right-Front: L3, and Left-Front: L1), and ±120° (Right-Rear: L4 and
Left-Rear: L5). The frontal ±60° stereo pair L1/L3 provides poor frontal images in
the range covering ±10° (sectors 62/63). Addition of the centre-front unit L2 and
implementation of crosstalk cancellation on the left-front/centre-front (L1/L2) and
right-front/centre-front (L2/L3) pairs provides sufficient coverage for sound images
in these sectors. It can be seen that there is a possibility of extending the
coverage between the centre loudspeaker L2 and each of the respective front
loudspeakers L1, L3. The pairwise concept employs a strategy of applying the best
pair available to achieve good localisation and in this case, subjective tests have
shown that sound images projected at the angles between -10° and -60° (sector
61) and between +10° and +60° (sector 64) are better localised using the
left-front/right-front non-adjacent pair L1/L3 than when processed by either the
left-front/centre-front or centre-front/right-front pairs (L1/L2, L2/L3).

The pairwise concept is not restricted to just these few loudspeaker
configurations and locations. The invention delivers a new yet direct general
approach to solving three-dimensional sound field spatialisation for multiple
loudspeaker applications. The loudspeaker array itself may be designed to comply
with other constraints such as cost (in particular the number of loudspeakers to be
used) and the availability of locations to site the loudspeakers. With such a
strategy, in general terms, the best localisation effect of a sound source is
achieved by engaging a crosstalk cancellation process that relates to the most
appropriate pair of loudspeakers available in the array. This is not restricted to
the direct path of sound sources. Each individual reflection of a sound source could
be treated as a further virtual source, with a suitable delay with respect to the
primary source, to simulate a reflected sound. Applying the appropriate crosstalk
cancellation process to each reflection could accurately render their positions in
space, which is essential to an immersive spatial environment.
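
Treating a reflection as a delayed secondary virtual source might be sketched as follows. The text says only that the delay should be "suitable"; deriving it from the extra path length of the reflection, the speed of sound value of 343 m/s, the attenuation gain and all function names are assumptions made for this example.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def reflection_delay_samples(direct_path_m, reflected_path_m, sample_rate_hz):
    """Delay of a reflection relative to the primary source, in samples."""
    extra_m = reflected_path_m - direct_path_m
    if extra_m < 0:
        raise ValueError("a reflected path cannot be shorter than the direct path")
    return round(extra_m / SPEED_OF_SOUND_M_S * sample_rate_hz)


def delayed_copy(signal, delay_samples, gain):
    """Secondary source signal: the primary signal delayed and attenuated."""
    return [0.0] * delay_samples + [gain * s for s in signal]


primary = [1.0, 0.5]
delay = reflection_delay_samples(2.0, 5.43, 48000)
print(delay)
print(delayed_copy(primary, 3, 0.5))
```

The delayed copy would then be spatialised like any other virtual source, that is, filtered for its own direction and routed to the crosstalk cancellation process of the pair covering that direction.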

The introduction of unconventional loudspeaker locations also reveals
exceptional rear localisation of sound images and, combined with another paired
configuration that has good frontal attributes, gives strong distinction between
front and rear virtual images, thereby eliminating front-back and back-front
ambiguities.

The ability to project static virtual point sources accurately contributes
greatly to teleconferencing and fully immersive personal workstation
applications. Further applications extend to home cinema setups in which the
loudspeaker positions intended for a cinema need to be simulated. The home
environment is restricted in both the number of available loudspeakers and in the
availability of positions to place them. Virtual loudspeakers in such a setup could
be rendered in their respective places as shown in Figure 7. In the example
illustrated, five virtual units 71, 72, 73, 74, 75 are simulated by only three real
units L1, L2, L3, configured as already described with reference to Figure 5. Two
of the virtual units 74, 75 are located outside the confines of the room R in which
the loudspeakers L1, L2, L3 and the listener Z are located. This could overcome
the limitation of physical point sources and available listening space in a room.
Directional loudspeakers can be used to reduce the volume of sound audible at
locations away from the listener Z, and in particular at the locations of the virtual
rear surround units 74, 75 outside the room R.

A simplified example of an implementation of the system, based on a
four-loudspeaker array with only two sectors, as shown in Figure 8, will now be
described. Figure 9 shows the array set up with pairwise crosstalk cancellation
applied to a forward pair L1, L2 set at ±60° and a side pair L3, L4 set at ±90°,
i.e. it is based on the assumption that the forward pair L1, L2 provides the best
reconstruction of spatialised images in the front sector (Sector 81) and the side
pair L3, L4 provides the best reconstruction of spatialised images in the rear sector
(Sector 82).

The example depicts five virtual sources X_0, X_1, X_2, X_3, X_4 to be
spatialised; however, the implementation of the pairwise concept does not limit the
number of input sources.

The input sources X_0 to X_4 are each first subjected to analogue/digital
conversion in a bank 91 of converters A/D. The input sources are then treated in a
bank of processors 92 with the appropriate head-related transfer functions
(HRTFs) H_{X0L}, H_{X0R}, H_{X1L}, H_{X1R}, H_{X2L}, H_{X2R}, H_{X3L}, H_{X3R}, H_{X4L}, H_{X4R}, where H_{X0L} is
the HRTF of source X_0 to Left Ear, H_{X0R} is the HRTF of source X_0 to Right Ear,
etc. The left outputs of the front three sources X_{0L}, X_{1L}, X_{2L} are then combined in a
combiner 93, and similarly for the right outputs X_{0R}, X_{1R}, X_{2R} (combiner 93a), and
the two outputs are filtered in a processor 94 by the forward pair crosstalk
cancellation matrix for the reconstruction of virtual images in the front sector 81.
The remaining two input sources X_3, X_4 are similarly filtered by the side pair
crosstalk cancellation matrix (processor 94a) for the reconstruction of virtual
images in the rear sector 82. The outputs from the cancellation stages 94, 94a are
then subject to digital/analogue conversion (D/A) (converters 96) for output to the
appropriate loudspeakers L1, L2; L3, L4.

In the pairwise crosstalk cancellation processes, the following calculations
are performed:

for loudspeaker L1:
\[ Y_1 = H'_{2R} X_L + H'_{2L} X_R \]
where:
\[
H'_{2L} = \frac{-H_{2L}}{H_{1L} H_{2R} - H_{1R} H_{2L}}, \qquad
H'_{2R} = \frac{H_{2R}}{H_{1L} H_{2R} - H_{1R} H_{2L}}
\]

for loudspeaker L2:
\[ Y_2 = H'_{1R} X_L + H'_{1L} X_R \]
where:
\[
H'_{1L} = \frac{H_{1L}}{H_{1L} H_{2R} - H_{1R} H_{2L}}, \qquad
H'_{1R} = \frac{-H_{1R}}{H_{1L} H_{2R} - H_{1R} H_{2L}}
\]

where: H_{1L} = HRTF of Loudspeaker L1 to Left Ear
H_{1R} = HRTF of Loudspeaker L1 to Right Ear
H_{2L} = HRTF of Loudspeaker L2 to Left Ear
H_{2R} = HRTF of Loudspeaker L2 to Right Ear

for loudspeaker L3:
\[ Y_3 = H'_{4R} X_L + H'_{4L} X_R \]
where:
\[
H'_{4L} = \frac{-H_{4L}}{H_{3L} H_{4R} - H_{3R} H_{4L}}, \qquad
H'_{4R} = \frac{H_{4R}}{H_{3L} H_{4R} - H_{3R} H_{4L}}
\]

for loudspeaker L4:
\[ Y_4 = H'_{3L} X_R + H'_{3R} X_L \]
where:
\[
H'_{3L} = \frac{H_{3L}}{H_{3L} H_{4R} - H_{3R} H_{4L}}, \qquad
H'_{3R} = \frac{-H_{3R}}{H_{3L} H_{4R} - H_{3R} H_{4L}}
\]

where: H_{3L} = HRTF of Loudspeaker L3 to Left Ear
H_{3R} = HRTF of Loudspeaker L3 to Right Ear
H_{4L} = HRTF of Loudspeaker L4 to Left Ear
H_{4R} = HRTF of Loudspeaker L4 to Right Ear
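
The signal flow of Figure 9 and the calculations above might be sketched for a single frequency as follows. This is an illustrative sketch under stated assumptions, not the circuit itself: the function names are hypothetical, every transfer-function and source value is made up, the A/D and D/A stages are omitted, and a real system would apply the same arithmetic at every frequency or as time-domain filters.

```python
def cancel_pair(ha_l, ha_r, hb_l, hb_r, x_l, x_r):
    """Crosstalk cancellation for one pair (a, b) at one frequency."""
    det = ha_l * hb_r - ha_r * hb_l
    y_a = (hb_r * x_l - hb_l * x_r) / det
    y_b = (ha_l * x_r - ha_r * x_l) / det
    return y_a, y_b


def binaural_mix(sources, source_hrtfs):
    """Filter each source by its (left, right) HRTF and sum per ear."""
    x_l = sum(h_l * x for x, (h_l, h_r) in zip(sources, source_hrtfs))
    x_r = sum(h_r * x for x, (h_l, h_r) in zip(sources, source_hrtfs))
    return x_l, x_r


def render(sources, source_hrtfs, front_ids, rear_ids, speaker_hrtfs):
    """Return loudspeaker signals Y1..Y4 for the layout of Figure 8."""
    def mix(ids):
        return binaural_mix([sources[i] for i in ids],
                            [source_hrtfs[i] for i in ids])
    xf_l, xf_r = mix(front_ids)  # combiners for the front sector
    xr_l, xr_r = mix(rear_ids)   # combiners for the rear sector
    y1, y2 = cancel_pair(*speaker_hrtfs["L1"], *speaker_hrtfs["L2"], xf_l, xf_r)
    y3, y4 = cancel_pair(*speaker_hrtfs["L3"], *speaker_hrtfs["L4"], xr_l, xr_r)
    return {"L1": y1, "L2": y2, "L3": y3, "L4": y4}


# Made-up spectra of X0..X4 and made-up (left ear, right ear) HRTF values.
sources = [1 + 0j, 0.5j, -0.3 + 0.2j, 0.7 - 0.1j, 0.2 + 0.4j]
source_hrtfs = [(1.0, 1.0), (0.9, 0.5 - 0.2j), (0.5 + 0.1j, 0.9),
                (0.8, 0.3j), (0.3 - 0.1j, 0.8)]
speaker_hrtfs = {"L1": (1.0, 0.4 - 0.3j), "L2": (0.4 - 0.3j, 1.0),
                 "L3": (1.0, 0.2 - 0.1j), "L4": (0.2 - 0.1j, 1.0)}

out = render(sources, source_hrtfs, [0, 1, 2], [3, 4], speaker_hrtfs)

# Check: the ear signals produced by all four loudspeakers equal the
# binaural mix of all five sources.
ear_l = sum(speaker_hrtfs[k][0] * out[k] for k in out)
ear_r = sum(speaker_hrtfs[k][1] * out[k] for k in out)
want_l, want_r = binaural_mix(sources, source_hrtfs)
print(abs(ear_l - want_l) < 1e-12, abs(ear_r - want_r) < 1e-12)
```

Because each pair reproduces its own pair of ear signals exactly, the contributions of the two pairs simply add at the ears, which is why the processes can be run independently.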
CLAIMS

1. A sound reproduction system for reproducing sound, the system
comprising a plurality of loudspeakers, a processor capable of determining where,
within a defined space, one or more virtual sound sources are located and, for each
virtual sound source, means for selecting a sub-set of the loudspeakers, said
sub-set being selected from the plurality of loudspeakers on the basis of the location of
the virtual sound source in the defined space, and means for applying a cross talk
cancellation process to the selected sub-set of the loudspeakers.

2. A sound reproduction system as claimed in Claim 1, having means for
reproducing at least a primary virtual sound source and a secondary virtual sound
source, and for delaying the secondary virtual sound source signal with respect to
the primary source, to simulate a reflection of the primary source.

3. A sound reproduction system as defined in Claim 1 or Claim 2 wherein the 
subsets of loudspeakers are pairs of loudspeakers. 

4. A sound reproduction system as claimed in claim 3 wherein there are four
loudspeakers arranged substantially at 30° and 120° to left and right of a
pre-determined centre line, and wherein four sectors are defined bounded by divisions
at substantially 60° and 120° to left and right of the centre line, and wherein for
virtual sources in the sector bounded by the divisions to 60° left and right of the
centre line the loudspeakers at 30° from the centre line are selected, for positions
greater than 120° to left and right of the centre line the loudspeakers at 120° to
left and right of the centre line are selected, and for intermediate angles
to the left of the centre line the two left-hand
loudspeakers are selected and for intermediate angles to the right of the centre line
the two right-hand loudspeakers are selected.

5. A sound reproduction system as claimed in claim 3 wherein there are five
loudspeakers arranged substantially at 0°, 60° and 120° to left and right of a
pre-determined centre line, and wherein five sectors are defined bounded by divisions
at substantially 0°, 10°, and 120° to left and right of the centre line, and wherein
for virtual sources in the sector bounded by the divisions at 0° and 60° left of the
centre line the loudspeakers at 0° and 60° left from the centre line are selected,
for virtual sources in the sector bounded by the divisions at 0° and 60° right of the
centre line the loudspeakers at 0° and 60° right from the centre line are selected,
for positions greater than 120° to left and right of the centre line the loudspeakers
at 120° to left and right of the centre line are selected, and for intermediate angles
between 10° and 120° left or right of the centre line the two loudspeakers at 60°
left and right of the centre line are selected.

6. A system according to claim 3 wherein there are three loudspeakers,
arranged in front of a listening point and at 90° of the centre line to left and right,
wherein virtual sources to the rear of the user are reproduced using the left and
right loudspeakers and virtual sources to the front of the user are represented by
the central loudspeaker and the left or right loudspeaker according to which side of the
centre line the virtual source is.

7. A method of sound reproduction for reproducing sound by way of a
plurality of loudspeakers, the method comprising the steps of determining where,
within a defined space, one or more virtual sound sources are located and, for each
virtual sound source, applying a cross talk cancellation process to a sub-set of the
loudspeakers, said sub-set being selected from the plurality of loudspeakers on the
basis of the location of the virtual sound source in the defined space.

8. A method of sound reproduction as claimed in Claim 7, for reproducing at
least a primary virtual sound source and a secondary virtual sound source, wherein
the secondary virtual sound source signal is delayed with respect to the primary
source, to simulate a reflection of the primary source.

9. A method of sound reproduction as defined in claim 7 or 8 wherein the
loudspeakers are selected pairwise.

10. A method of sound reproduction according to claim 7, 8, or 9 wherein a
plurality of virtual sound sources are operated on simultaneously, with cross-talk
cancellation processes applied to appropriate sub-sets of the loudspeakers for each
virtual sound source.

11. A sound reproduction system substantially as described with reference to
the drawings.



12. A method of sound reproduction for reproducing sound substantially as
described with reference to the drawings.